Gender and affect recognition based on GMM and GMM-UBM modeling with relevance MAP estimation

Authors

  • Rok Gajsek
  • Janez Zibert
  • Tadej Justin
  • Vitomir Struc
  • Bostjan Vesnicer
  • France Mihelic
Abstract

The paper presents our efforts in the Gender Sub-Challenge and the Affect Sub-Challenge of the INTERSPEECH 2010 Paralinguistic Challenge. The system for the Gender Sub-Challenge models Mel-frequency cepstral coefficients with Gaussian mixture models, building a separate model for each gender category. For the Affect Sub-Challenge we propose a modeling scheme in which a universal background model is first trained on all the training data and then, using the maximum a posteriori estimation criterion, a new feature vector of means is produced for each sample. The feature set comprises the low-level descriptors from the baseline system, which in our case are split into four subsets, each modeled by its own model. Predictions from all subsystems are fused using the sum rule. Aside from the baseline regression procedure, we also evaluated Support Vector Regression and compared the performance. Both systems achieve higher recognition results on the development set than the baseline, but in the Affect Sub-Challenge our system’s cross-correlation is lower than that of the baseline system, although the mean linear error is slightly better. In the Gender Sub-Challenge the unweighted average recall on the test set is 82.84%, and for the Affect Sub-Challenge the cross-correlation on the test set is 0.39 with a mean linear error of 0.143.
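
The abstract does not give the adaptation formulas, so the following is only a minimal sketch of how relevance MAP adaptation of UBM means and sum-rule fusion are commonly written. It assumes a diagonal-covariance UBM, the usual update m_c = a_c * E_c[x] + (1 - a_c) * mu_c with a_c = n_c / (n_c + r), and a relevance factor r = 16; none of these are confirmed by the paper. The function names `map_adapt_means` and `sum_rule_fusion` are hypothetical, and fusion is taken here to be the normalized sum (mean) of the subsystem predictions.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at frame x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def posteriors(x, weights, means, variances):
    """Posterior probability of each UBM component given frame x."""
    logs = [
        math.log(w) + log_gauss_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    top = max(logs)  # subtract the maximum for numerical stability
    exps = [math.exp(value - top) for value in logs]
    total = sum(exps)
    return [e / total for e in exps]


def map_adapt_means(frames, weights, means, variances, relevance=16.0):
    """Return the MAP-adapted component means, concatenated into one vector.

    relevance=16.0 is an assumed value, not one taken from the paper.
    """
    n_comp, dim = len(means), len(means[0])
    counts = [0.0] * n_comp                      # soft counts n_c
    sums = [[0.0] * dim for _ in range(n_comp)]  # posterior-weighted sums
    for x in frames:
        for c, p in enumerate(posteriors(x, weights, means, variances)):
            counts[c] += p
            for d in range(dim):
                sums[c][d] += p * x[d]
    supervector = []
    for c in range(n_comp):
        alpha = counts[c] / (counts[c] + relevance)
        for d in range(dim):
            # With no data assigned to a component, fall back to the UBM mean.
            data_mean = sums[c][d] / counts[c] if counts[c] > 0 else means[c][d]
            supervector.append(alpha * data_mean + (1.0 - alpha) * means[c][d])
    return supervector


def sum_rule_fusion(predictions):
    """Fuse per-subsystem predictions by their normalized sum (mean)."""
    n_systems = len(predictions)
    return [sum(values) / n_systems for values in zip(*predictions)]


# Toy one-dimensional UBM with two components (illustrative values only).
ubm_weights = [0.5, 0.5]
ubm_means = [[0.0], [10.0]]
ubm_vars = [[1.0], [1.0]]
sample = [[1.0]] * 16  # one "utterance" of 16 identical frames
adapted = map_adapt_means(sample, ubm_weights, ubm_means, ubm_vars)
fused = sum_rule_fusion([[0.2, 0.4], [0.4, 0.8]])
```

In this toy case the first component receives almost all of the soft count, so its mean moves about halfway from the UBM mean toward the data, while the second component stays close to its UBM mean. A larger relevance factor would keep the adapted means closer to the UBM.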

Similar resources

Study on the Relevance Factor of Maximum a Posteriori with GMM for Language Recognition

In this paper, the relevance factor in maximum a posteriori (MAP) adaptation of a Gaussian mixture model (GMM) from a universal background model (UBM) is studied for language recognition. In conventional MAP, the relevance factor is typically set empirically to a constant. Knowing that the relevance factor determines how much the observed training data influence the model adaptation, thus the resulting GMM...

A discriminative performance metric for GMM-UBM speaker identification

Gaussian mixture modeling with universal background model (GMM-UBM) is a widely used method for speaker identification, where the GMM model is used to characterize a specific speaker’s voice. The estimation of model parameters is generally performed based on the maximum likelihood (ML) or maximum a posteriori (MAP) criteria. In this way, interspeaker information that discriminates between diffe...

Discriminant Approaches for GMM Based Speaker Detection Systems

This paper presents some experiments on discriminative training for GMM/UBM based speaker recognition systems. We propose two MMIE adaptation methods for GMM component weights suitable for speaker recognition. The impact of these training methods on performance is compared to the standard weight estimation/adaptation criteria, MLE and MAP, on standard GMM based systems and on SVM based systems. ...

Maximum Entropy Based Data Selection for Speaker Recognition

This paper presents a data selection method for speaker recognition. Since there is no promise that more data guarantee better results, the way data are selected becomes important. In GMM-UBM speaker recognition, the UBM is trained to represent the speaker-independent distribution of acoustic features while the GMM speaker model is tailored for a specific speaker. In this study of data se...

Modeling high-level information by using Gaussian mixture correlation for GMM-UBM based speaker recognition

The Gaussian mixture model-universal background model (GMM-UBM) has been dominant in text-independent speaker recognition tasks. However the conventional GMM-UBM method assumes that each Gaussian mixture is independent and ignores the fact that within Gaussian mixtures, there do exist some useful high-level speaker-dependent characteristics, such as word usage or speaking habits. Based on the G...

Publication date: 2010